Abstract

We continue the investigation of locally testable codes, i.e., error-correcting codes for which
membership of a given word in the code can be tested probabilistically by examining it in very 
few locations. We give two general results on local testability: First, motivated by the recently 
proposed notion of robust probabilistically checkable proofs, we introduce the notion of robust 
local testability of codes. We relate this notion to a product of codes introduced by Tanner, 
and show a very simple composition lemma for this notion. Next, we show that codes built by 
tensor products can be tested robustly and somewhat locally, by applying a variant of a test and 
proof technique introduced by Raz and Safra in the context of testing low-degree multivariate 
polynomials (which are a special case of tensor codes). 

Combining these two results gives us a generic construction of codes of inverse polynomial 
rate, that are testable with poly-logarithmically many queries. We note these locally testable 
tensor codes can be obtained from any linear error correcting code with good distance. Pre- 
vious results on local testability, albeit much stronger quantitatively, rely heavily on algebraic 
properties of the underlying codes. 

1 Introduction 

Locally testable codes (LTCs) are error-correcting codes that admit highly efficient probabilistic 
tests of membership. Specifically, an LTC has a tester that makes a small number of oracle accesses 
into an oracle representing a given word w, accepts if w is a codeword, and rejects with constant 
probability if w is far from every codeword. LTCs are combinatorial counterparts of probabilistically 
checkable proofs (PCPs), and were defined in [18, 25, 2], and their study was revived in [20]. 

Constructions of locally testable codes typically come in two stages. The first stage is algebraic 
and gives local tests for algebraic codes, usually based on multivariate polynomials. This is based 
on a rich collection of results on "linearity testing" or "low-degree testing" [1, 3, 4, 5, 6, 7, 8, 9, 13, 
14, 16, 17, 18, 20, 23, 25]. This first stage either yielded codes of poor rate (mapping k information 
symbols to codewords of length exp(k)) as in [14], or yielded codes over large alphabets as in [25].
To reduce the alphabet size, a second stage of "composition" is then applied. In particular, this
is done in [20, 13, 11] to get codes mapping k information bits to codewords of length $k^{1+o(1)}$, over
the binary alphabet. This composition follows the lines of PCP composition introduced in [4], but 
turns out to be fairly complicated, and in most cases, even more intricate than PCP composition. 
The one exception is in [20, Section 3], where the composition is simple, but based on very specific 
properties of the codes used. Thus while the resulting constructions are surprisingly strong, the 
proof techniques are somewhat complex. 

In this paper, we search for simple and general results related to local testing. A generic (non- 
algebraic) analysis of low-degree tests appears in [19], and a similar approach to PCPs appears in 
[15]. Specifically, we search for generic (non-algebraic) ways of getting codes, possibly over large 
alphabets, that can be tested by relatively local tests, as a substitute for algebraic ways. And we 
look for simpler composition lemmas. We make some progress in both directions. We show that 
the "tensor product" operation, a classical operation that takes two codes and produces a new one, 
when applied to linear codes gives codes that are somewhat locally testable (See Theorem 2.6). 
To simplify the second stage, we strengthen the notion of local testability to a "robust" one. This 
step is motivated by an analogous step taken for PCPs in [11], but is naturally formulated in our 
case using the "Tanner Product" for codes [27]. Roughly speaking, a "big" Tanner Product code 
of block-length n is defined by a "small" code of block-length n' = o(n) and a collection of subsets
$S_1, \ldots, S_m \subseteq [n]$, each of size n'. A word is in the big code if and only if its projection to every
subset Si is a word of the small code. Tanner Product codes have a natural local test associated with 
them: to test if a word w is a codeword of the big code, pick a random subset Sj and verify that w 
restricted to Sj is a codeword of the small code. The normal soundness condition would expect that 
if w is far from every codeword, then for a constant fraction of such restrictions, w restricted to Sj 
is not a codeword of the small code. Now the notion of robust soundness strengthens this condition 
further by expecting that if w is far from every codeword, then many (or most) projections actually 
lead to words that are far from codewords of the small code. In other words, a code is robust if 
global distance (from the large code) translates into (average) local distance (from the small code). 
A simple, yet crucial observation is that robust codes compose naturally. Namely, if the small code 
is itself locally testable by a robust test (with respect to a tiny code, of block-length o(n')), then 
distance from the large code (of block-length n) translates to distance from the tiny code, thus 
reducing query complexity while maintaining soundness. By viewing a tensor product as a robust 
Tanner product code, we show that a (log N / log log N)-wise tensor product of any linear code of
length n = poly log N and relative distance 1 - […], which yields a code of length N and
polynomial rate, is testable with poly(log N) queries (Theorem 2.7). Once again, while stronger
theorems than the above have been known since [6], the generic nature of the result above might 
shed further light on the notion of local testability. 

Organization. We give formal definitions and mention our main theorems in Section 2. In 
Section 3 we analyze the basic tester for tensor product codes. Finally in Section 4 we describe our 
composition and analyze some tests based on our composition lemma. 

2 Definitions and Main Results 

Throughout this paper $\Sigma$ will denote a finite alphabet, and in fact a finite field. For positive integer
n, let [n] denote the set {1, . . . , n}. For a sequence $x \in \Sigma^n$ and $i \in [n]$, we will let $x_i$ denote the
ith element of the sequence. The Hamming distance between strings $x, y \in \Sigma^n$, denoted $\Delta(x,y)$, is
the number of $i \in [n]$ such that $x_i \ne y_i$. The relative distance between $x, y \in \Sigma^n$, denoted $\delta(x,y)$,
is the ratio $\Delta(x,y)/n$.
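
As a minimal illustration of these two definitions, the following Python sketch computes $\Delta(x,y)$ and $\delta(x,y)$ for two sequences of equal length. The function names are ours and do not appear in the paper.

```python
from fractions import Fraction

def hamming_distance(x, y):
    """Delta(x, y): number of positions i with x[i] != y[i]."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(1 for a, b in zip(x, y) if a != b)

def relative_distance(x, y):
    """delta(x, y) = Delta(x, y) / n, kept exact as a fraction."""
    return Fraction(hamming_distance(x, y), len(x))
```

For instance, the strings 0110 and 1100 differ in two of their four positions, so their relative distance is 1/2.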

A code C of length n over $\Sigma$ is a subset of $\Sigma^n$. Elements of C are referred to as codewords.
When $\Sigma$ is a field, one may think of $\Sigma^n$ as a vector space. If C is a linear subspace of the
vector space $\Sigma^n$, then C is called a linear code. The crucial parameters of a code, in addition to
its length and the alphabet, are its dimension (or information length) and its distance, given by
$\Delta(C) = \min_{x \ne y \in C}\{\Delta(x,y)\}$. A linear code of dimension k, length n, distance d over the alphabet
$\Sigma$ is denoted an $[n,k,d]_\Sigma$ code. For a word $r \in \Sigma^n$ and a code C, we let $\delta_C(r) = \min_{x \in C}\{\delta(r,x)\}$.
We say r is $\delta'$-proximate to C ($\delta'$-far from C, respectively) if $\delta_C(r) \le \delta'$ ($\delta_C(r) \ge \delta'$, respectively).
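
The parameters above can be computed by brute force for very small codes. The sketch below assumes the binary field GF(2) and a hypothetical [3, 2, 2] parity-check code as the example; the helper names are illustrative only.

```python
from itertools import product

def codewords_gf2(M):
    """All codewords xM over GF(2) for a k x n generator matrix M (list of rows)."""
    k, n = len(M), len(M[0])
    return {
        tuple(sum(x[i] * M[i][j] for i in range(k)) % 2 for j in range(n))
        for x in product((0, 1), repeat=k)
    }

def code_distance(code):
    """Delta(C): minimum Hamming distance between two distinct codewords."""
    words = sorted(code)
    return min(
        sum(a != b for a, b in zip(u, v))
        for idx, u in enumerate(words)
        for v in words[idx + 1:]
    )

def distance_to_code(r, code):
    """Hamming distance from r to its nearest codeword (divide by n for delta_C(r))."""
    return min(sum(a != b for a, b in zip(r, c)) for c in code)

# Hypothetical example: the [3, 2, 2] binary parity-check code.
M = [[1, 0, 1],
     [0, 1, 1]]
C = codewords_gf2(M)
```

Here the word (1, 0, 0) is at Hamming distance 1 from the codeword (0, 0, 0), so it is 1/3-proximate to C.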

Throughout this paper, we will be working with infinite families of codes, where their performance
will be measured as a function of their length. 

Definition 2.1 (Tester) A tester T with query complexity $q(\cdot)$ is a probabilistic oracle machine
that when given oracle access to a string $r \in \Sigma^n$, makes q(n) queries to the oracle for r and returns
an accept/reject verdict. We say that T tests a code C if whenever $r \in C$, T accepts with probability
one; and when $r \notin C$, the tester rejects with probability at least $\delta_C(r)/2$. A code C is said to be
locally testable with q(n) queries if there is a tester for C with query complexity q(n).

When referring to oracles representing vectors in $\Sigma^n$, we emphasize the queries by denoting the
response of the ith query by r[i], as opposed to $r_i$. Throughout this paper we consider only non-adaptive
testers, i.e., testers that use their internal randomness R to generate q queries $i_1, \ldots, i_q \in [n]$ and
a predicate $P : \Sigma^q \to \{0,1\}$ and accept iff $P(r[i_1], \ldots, r[i_q]) = 1$.

Our next definition is based on the notion of Robust PCP verifiers introduced by [11]. We need
some terminology first. 

Note that a tester T has two inputs: an oracle for a received vector r, and a random string s. On
input the string s the tester generates queries $i_1, \ldots, i_q \in [n]$ and fixes a circuit $C = C_s$ and accepts
if $C(r[i_1], \ldots, r[i_q]) = 1$. For oracle r and random string s, define the robustness of the tester T on
r, s, denoted $\rho^T(r,s)$, to be the minimum, over strings x satisfying C(x) = 1, of the relative distance of
$\langle r[i_1], \ldots, r[i_q] \rangle$ from x. We refer to the quantity $\rho^T(r) = E_s[\rho^T(r,s)]$ as the expected robustness
of T on r. When T is clear from context, we skip the superscript.
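
The robustness quantities can be evaluated directly for a toy tester. The sketch below assumes a hypothetical tester for the length-4 binary repetition code that queries a uniformly random pair of positions and accepts iff the two symbols agree; this tester is our own example, not one analysed in the paper.

```python
from fractions import Fraction
from itertools import combinations

def local_robustness(view, accepted_views):
    """rho(r, s): relative distance from the local view to the nearest accepted view."""
    q = len(view)
    nearest = min(sum(a != b for a, b in zip(view, x)) for x in accepted_views)
    return Fraction(nearest, q)

def expected_robustness(r, tests):
    """rho(r): average of rho(r, s) over the tests, each chosen with equal probability."""
    total = sum(
        local_robustness(tuple(r[i] for i in queries), accepted)
        for queries, accepted in tests
    )
    return total / len(tests)

# Hypothetical tester: all 6 pairs of positions, accepting views (0,0) and (1,1).
n = 4
accepted = [(0, 0), (1, 1)]
tests = [((i, j), accepted) for i, j in combinations(range(n), 2)]

rho_far = expected_robustness((1, 0, 0, 0), tests)
rho_codeword = expected_robustness((1, 1, 1, 1), tests)
```

On the word (1, 0, 0, 0), three of the six pairs see a disagreement at local relative distance 1/2, so the expected robustness is 1/4, which equals the relative distance of this word from the repetition code.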

Definition 2.2 (Robust Tester) A tester T is said to be $\alpha$-robust for a code C if for every $r \in C$,
the tester accepts w.p. one, and for every $r \in \Sigma^n$, $\rho^T(r) \ge \alpha \cdot \delta_C(r)$.

Having a robust tester for a code C implies the existence of a tester for C, as illustrated by the 
following proposition. 

Proposition 2.3 If a code C has an $\alpha$-robust tester T making q queries, then it is locally
testable with $O(q/\alpha)$ queries.
Proof: Let $c = \lceil \alpha^{-1} \rceil$. The local tester T' for C is obtained by invoking T c times and accepting
if all invocations accept. Consider a word r with $\delta_C(r) = \delta$. For at least a $\delta/c \le \alpha \cdot \delta$ fraction of the
choices of random strings s of T, it must be that $\rho^T(r,s) > 0$ and T rejects. Thus the probability
that T' does not reject in any of the c repetitions is at most

\[
\begin{aligned}
(1 - \delta/c)^c &\le 1 - \delta + \frac{c^2}{2}(\delta/c)^2 && \text{(By Inclusion-Exclusion)} \\
&= 1 - \delta + \delta^2/2 \\
&\le 1 - \delta + \delta/2 \\
&= 1 - \delta/2.
\end{aligned}
\]

Thus words at distance $\delta$ from codewords are rejected with probability at least $\delta/2$. ∎

The previous proposition shows that large robustness leads to small query complexity. However,
there is a limit to the size of the robustness parameter as shown in the next claim.

Proposition 2.4 If T is an $\alpha$-robust tester for a linear code $C \subseteq \Sigma^n$ with minimal (non-relativized)
distance at least two, then $\alpha \le 1$.

Proof: W.l.o.g. T is non-adaptive, i.e. the set of queries performed by T does not depend on
the received word r (only on the randomness s) [12]. Let $T_1, \ldots, T_s$ be the set of possible tests
performed by T, let $p_j$ be the probability $T_j$ is performed, and let $q_j$ be the query complexity of
$T_j$. Let $S_i$ be the set of tests that query $i \in [n]$ and let $wt(i) = \sum_{j \in S_i} p_j / q_j$ be the weight of
$i \in [n]$. There must be some i with weight at most 1/n because the sum of weights is one. Look at
the word r that is zero everywhere but on the i-th coordinate, where it is one. On the one hand
$\delta_C(r) = 1/n$, because C is a linear code of minimal distance > 1. On the other hand, the robustness
satisfies $\rho^T(r) \le wt(i) \le 1/n$. Thus, the robustness parameter is at most one. ∎

The main results of this paper focus on robust local testability of certain codes. For the first result,
we need to describe the tensor product of codes.

Tensor Products and Local Tests Recall that an $[n,k,d]_\Sigma$ linear code C may be represented
by a $k \times n$ matrix M over $\Sigma$ (so that $C = \{xM \mid x \in \Sigma^k\}$). Such a matrix M is called a generator
of C. Given an $[n_1,k_1,d_1]_\Sigma$ code $C_1$ with generator $M_1$ and an $[n_2,k_2,d_2]_\Sigma$ code $C_2$ with generator
$M_2$, their tensor product (cf. [22], [26, Lecture 6, Section 2.4]), denoted $C_1 \otimes C_2 \subseteq \Sigma^{n_2 \times n_1}$, is the
code whose codewords may be viewed as $n_2 \times n_1$ matrices given explicitly by the set
$\{M_2^T X M_1 \mid X \in \Sigma^{k_2 \times k_1}\}$. It is well-known that $C_1 \otimes C_2$ is an $[n_1 n_2, k_1 k_2, d_1 d_2]_\Sigma$ code.

Tensor product codes are interesting to us in that they are a generic construction of codes with
"non-trivially" local redundancy. To elaborate, every linear code of dimension k does have redundancies
of size O(k), i.e., there exist subsets of t = O(k) coordinates where the code does not take all
possible $\Sigma^t$ values. But such redundancies are not useful for constructing local tests; and
unfortunately generic codes of length n and dimension k may not have any redundancies of length
o(k). However, tensor product codes are different in that the tensor product of an $[n,k,d]_\Sigma$ code
C with itself leads to a code of dimension $k^2$ which is much larger than the size of redundancies
which are O(k)-long, as asserted by the following proposition.

Proposition 2.5 A matrix $r \in \Sigma^{n_2 \times n_1}$ is a codeword of $C_1 \otimes C_2$ if and only if every row is a
codeword of $C_1$ and every column is a codeword of $C_2$.
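
To make the definition of the tensor product and Proposition 2.5 concrete, the following sketch enumerates a tiny tensor code in two ways and compares the results. It assumes GF(2) and takes both factors to be a hypothetical [3, 2, 2] parity-check code; all names are illustrative.

```python
from itertools import product

def matmul_gf2(A, B):
    """Matrix product over GF(2); A is p x q and B is q x s (lists of rows)."""
    return [
        [sum(A[i][t] * B[t][j] for t in range(len(B))) % 2 for j in range(len(B[0]))]
        for i in range(len(A))
    ]

def transpose(A):
    return [list(col) for col in zip(*A)]

def codewords_gf2(M):
    k = len(M)
    return {tuple(matmul_gf2([list(x)], M)[0]) for x in product((0, 1), repeat=k)}

M1 = [[1, 0, 1], [0, 1, 1]]   # generator of C1 (assumed example)
M2 = [[1, 0, 1], [0, 1, 1]]   # generator of C2 (assumed example)
C1, C2 = codewords_gf2(M1), codewords_gf2(M2)
k1, n1, k2, n2 = len(M1), len(M1[0]), len(M2), len(M2[0])

# Codewords of the tensor product from the explicit form M2^T X M1.
tensor = set()
for bits in product((0, 1), repeat=k2 * k1):
    X = [list(bits[row * k1:(row + 1) * k1]) for row in range(k2)]
    W = matmul_gf2(matmul_gf2(transpose(M2), X), M1)   # an n2 x n1 matrix
    tensor.add(tuple(tuple(row) for row in W))

# Matrices whose rows all lie in C1 and whose columns all lie in C2.
constrained = set()
for bits in product((0, 1), repeat=n2 * n1):
    W = tuple(tuple(bits[row * n1:(row + 1) * n1]) for row in range(n2))
    if all(row in C1 for row in W) and all(col in C2 for col in zip(*W)):
        constrained.add(W)

# Minimum weight of a nonzero codeword, i.e. the distance of the (linear) tensor code.
min_weight = min(sum(map(sum, W)) for W in tensor if any(map(any, W)))
```

In this instance both descriptions give the same 16 matrices, and the minimum nonzero weight is 4, matching the $[n_1 n_2, k_1 k_2, d_1 d_2]$ parameters [9, 4, 4].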

In addition to being non-trivially local, the constraints enumerated above are also redundant, in
that it suffices to insist that all columns are codewords of $C_2$ and only $k_2$ (prespecified) rows are
codewords of $C_1$. Thus the insistence that other rows ought to be codewords of $C_1$ is redundant,
and leads to the hope that the tests may be somewhat robust. Indeed we may hope that the
following might be a robust test for $C_1 \otimes C_2$.

Product Tester: Pick $b \in \{1,2\}$ at random and $i \in [n_b]$ at random. Verify that r with
b-th coordinate restricted to i is a codeword of $C_{3-b}$.

While it is possible to show that the above is a reasonable tester for $C_1 \otimes C_2$, it remains open if the
above is a robust tester for $C_1 \otimes C_2$. (Note that the query complexity of the test is $\max\{n_1, n_2\}$,
which is quite high. However if the test were robust, there would be ways of reducing this query
complexity in many cases, as we will see later.)
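
A small numerical illustration of the Product Tester's expected robustness, under the assumption that both factors are the hypothetical binary [3, 2, 2] parity-check code (so the tester's views are exactly the three rows and three columns of a 3 x 3 matrix, each chosen with probability 1/6). This is a sketch for intuition only and says nothing about robustness in general.

```python
from fractions import Fraction

# Assumed example code, used for both C1 and C2.
PARITY_CODE = {(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)}

def relative_distance_to_code(view, code):
    nearest = min(sum(a != b for a, b in zip(view, c)) for c in code)
    return Fraction(nearest, len(view))

def product_tester_views(W):
    """All local views of the tester on the square matrix W: every row and every column."""
    rows = [tuple(row) for row in W]
    cols = [tuple(col) for col in zip(*W)]
    return rows + cols

def expected_robustness(W, code=PARITY_CODE):
    views = product_tester_views(W)
    return sum(relative_distance_to_code(v, code) for v in views) / len(views)

W = [[1, 0, 0],
     [0, 0, 0],
     [0, 0, 0]]
rho = expected_robustness(W)
rho_zero = expected_robustness([[0, 0, 0]] * 3)
```

The single-error matrix W is at relative distance 1/9 from the tensor code, and only one row and one column see the error, each at local distance 1/3, so the expected robustness is also 1/9.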

Instead, we consider higher products of codes, and give a tester based on an idea from the work of
Raz and Safra [24]. Specifically, we let $C^m$ denote the code $\underbrace{C \otimes \cdots \otimes C}_{m}$. We consider the following
test for this code:

m-Product Tester: Pick $b \in [m]$ and $i \in [n]$ independently and uniformly at random.
Verify that r with b-th coordinate restricted to i is a codeword of $C^{m-1}$.

Note that this tester makes $N^{1-1/m}$ queries to test a code of length $N = n^m$. So its query complexity
gets worse as m increases. However, we are only interested in the performance of the test for small
m (specifically m = 3, 4). We show that the test is a robust tester for $C^m$ for every $m \ge 3$.
Specifically, we show

Theorem 2.6 For a positive integer m and $[n,k,d]_\Sigma$-code C, such that $\left(\frac{d-1}{n}\right)^m \ge \frac{7}{8}$, m-Product
Tester is $2^{-16}$-robust for $C^m$.

This theorem is proven in Section 3. Note that the robustness is a constant, and the theorem
only needs the fractional distance of C to be sufficiently large as a function of m. In particular
a fractional distance of $1 - O(1/m)$ suffices. Note that such a restriction is needed even to get the
fractional distance of $C^m$ to be constant.

The tester however makes a lot of queries, and this might seem to make this result uninteresting (and 
indeed one doesn't have to work so hard to get a non-robust tester with such query complexity). 
However, as we note next, the query complexity of robust testers can be reduced significantly 
under some circumstances. To describe this we need to revisit a construction of codes introduced 
by Tanner [27]. 
Tanner Products and Robust Testing The robustness of the m-Product Tester above seems
to be naturally related to the fact that the tester's predicates are testing if the queried points
themselves belong to a smaller code. (In the case of the m-Product Tester, it verifies that the
symbols it reads give a codeword of the code $C^{m-1}$.) The notion that a bigger code (such as $C^m$)
may be specified by requiring that certain projections of a word fall in a smaller code (such as
$C^{m-1}$) is not a novel one. Indeed this idea goes back to the work of Tanner [27], who defined this
notion in its full generality and considered big codes obtained by a "product" of a bipartite graph
with a small code. This notion is commonly referred to in the literature as the Tanner Product,
and we define it next.

For integers (n, m, t) an (n, m, t)-ordered bipartite graph is given by n left vertices [n], and m right
vertices, where each right vertex has degree t and the neighborhood of a right vertex $j \in [m]$ is
ordered and given by a sequence $\ell_j = \langle \ell_{j,1}, \ldots, \ell_{j,t} \rangle$ with $\ell_{j,i} \in [n]$.

A Tanner Product Code (TPC) is specified by an (n, m, t)-ordered bipartite graph G and a code
$C_{small} \subseteq \Sigma^t$. The product code, denoted $TPC(G = \{\ell_1, \ldots, \ell_m\}, C_{small})$, is the set

\[
\{ r \in \Sigma^n \mid r|_{\ell_j} = \langle r_{\ell_{j,1}}, \ldots, r_{\ell_{j,t}} \rangle \in C_{small}, \ \forall j \in [m] \}.
\]

Notice that the Tanner Product naturally suggests a test for a code. "Pick a random right vertex
$j \in [m]$ and verify that $r|_{\ell_j} \in C_{small}$." Associating this test with such a pair $(G, C_{small})$, we say
that the pair is $\alpha$-robust if the associated test is an $\alpha$-robust tester for $TPC(G, C_{small})$.
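
The definition of a Tanner Product Code and its natural test can be sketched as follows. The instance is hypothetical: nine left vertices arranged as a 3 x 3 grid, six right vertices (the rows and columns), and the binary [3, 2, 2] parity-check code as the small code. Indices are 0-based in the code, unlike the 1-based [n] of the text.

```python
from itertools import product

def in_tanner_product(r, neighborhoods, small_code):
    """True iff every ordered projection r|_{l_j} is a codeword of the small code."""
    return all(tuple(r[i] for i in nbr) in small_code for nbr in neighborhoods)

def rejection_probability(r, neighborhoods, small_code):
    """Fraction of right vertices j whose projection r|_{l_j} is not in the small code."""
    bad = sum(tuple(r[i] for i in nbr) not in small_code for nbr in neighborhoods)
    return bad / len(neighborhoods)

# Assumed instance: grid position (row a, column c) has index 3*a + c.
rows = [(3 * a, 3 * a + 1, 3 * a + 2) for a in range(3)]
cols = [(c, c + 3, c + 6) for c in range(3)]
G = rows + cols
C_small = {(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)}

members = [r for r in product((0, 1), repeat=9) if in_tanner_product(r, G, C_small)]
single_error = (1,) + (0,) * 8
```

In this instance the Tanner Product Code has 16 codewords, and a word with a single nonzero symbol violates exactly two of the six constraints.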

The importance of this representation of tests comes from the composability of robust tests coming
from Tanner Product Codes. Suppose $(G, C_{small})$ is $\alpha$-robust and $C_{small}$ is itself a Tanner Product
Code, $TPC(G', C'_{small})$, where G' is a (t, m', t')-ordered bipartite graph and $(G', C'_{small})$ is $\alpha'$-
robust. Then $TPC(G, C_{small})$ has an $\alpha \cdot \alpha'$-robust tester that makes only t' queries. (This fact is
completely straightforward and proven in Lemma 4.1.)

This composition is especially useful in the context of tensor product codes. For instance, the
tester for $C^4$ is of the form $(G, C^3)$, while $C^3$ has a robust tester of the form $(G', C^2)$. Putting
them together gives a tester for $C^4$, where the tests verify appropriate projections are codewords of
$C^2$. The test itself is not surprising, however the ease with which the analysis follows is nice. (See
Lemma 4.2.) Now the generality of the tensor product tester comes in handy as we let C itself be
$C^2$ to see that we are now testing $C^8$ where tests verify some projections are codewords of $C^4$.
Again composition allows us to reduce this to a $C^2$-test. Carrying on this way we see that we can
test any code of the form $C^{2^t}$ by verifying certain projections are codewords of $C^2$. This leads to
a simple proof of the following theorem about the testability of tensor product codes.

Theorem 2.7 Let $\{C_i\}_i$ be any infinite family of codes with $C_i$ an $[n_i, k_i, d_i]_{\Sigma_i}$ code, with $n_i = p(k_i)$
for some polynomial $p(\cdot)$. Further, let $t_i$ be a sequence of integers such that $m_i = 2^{t_i}$ satisfies
$d_i/n_i \ge 1 -$ […]. Then the sequence of codes $\{C'_i = C_i^{m_i}\}_i$ is a sequence of codes of inverse
polynomial rate and constant relative distance that is locally testable with a polylogarithmic number
of queries.

This theorem is proven in Section 4. We remark that it is possible to get code families $C_i$ such as
above using Reed-Solomon codes, as well as algebraic-geometric codes.
3 Testing Tensor Product Codes

Recall that in this section we wish to prove Theorem 2.6. We first reformulate this theorem in the
language of Tanner products.

Let $G_n^m$ denote the graph that corresponds to the tests of $C^m$ by the m-Product Tester, where
$C \subseteq \Sigma^n$. Namely $G_n^m$ has $n^m$ left vertices labelled by elements of $[n]^m$. It has $m \cdot n$ right vertices
labelled (b, i) with $b \in [m]$ and $i \in [n]$. Vertex (b, i) is adjacent to all vertices $(i_1, \ldots, i_m)$ such
that $i_b = i$. The statement of Theorem 2.6 is equivalent to the statement that $(G_n^m, C^{m-1})$ is
$2^{-16}$-robust, provided $\left(\frac{d-1}{n}\right)^m \ge \frac{7}{8}$. The completeness of the theorem follows from Proposition 2.5,
which implies $C^m = TPC(G_n^m, C^{m-1})$. For the soundness, we first introduce some notation.
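
The graph underlying the m-Product Tester is easy to build explicitly for small parameters. The sketch below uses 0-based indices (so coordinates and symbols range over `range(m)` and `range(n)` rather than [m] and [n]); the function name is ours.

```python
from itertools import product

def product_tester_graph(n, m):
    """Edges of the test graph: left vertex u in [n]^m is joined to (b, u[b]) for each b."""
    left = list(product(range(n), repeat=m))
    right = [(b, i) for b in range(m) for i in range(n)]
    edges = [(u, (b, u[b])) for u in left for b in range(m)]
    return left, right, edges

left, right, edges = product_tester_graph(n=3, m=3)

# Count how many left vertices each right vertex (b, i) is adjacent to.
right_degree = {v: 0 for v in right}
for _, v in edges:
    right_degree[v] += 1
```

For n = m = 3 this gives 27 left vertices of degree m = 3 and 9 right vertices of degree $n^{m-1} = 9$, so both sides account for the same 81 edges.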

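Consider the code $C_1 \otimes \cdots \otimes C_m$, where $C_b$ is an $[n_b, k_b, d_b]_\Sigma$ code. Notice that codewords of this
code lie in $\Sigma^{n_1 \times \cdots \times n_m}$. The coordinates of strings in $\Sigma^{n_1 \times \cdots \times n_m}$ are themselves m-dimensional
vectors over the integers (from $[n_1] \times \cdots \times [n_m]$). For $r \in \Sigma^{n_1 \times \cdots \times n_m}$ and $i_1, \ldots, i_m$ with $i_j \in [n_j]$,
let $r[i_1, \ldots, i_m]$ denote the $\langle i_1, \ldots, i_m \rangle$-th coordinate of r. For $b \in [m]$ and $i \in [n_b]$, let
$r_{b,i} \in \Sigma^{n_1 \times \cdots \times n_{b-1} \times n_{b+1} \times \cdots \times n_m}$ be the vector obtained by projecting r to coordinates whose b-th
coordinate is i, i.e., $r_{b,i}[i_1, \ldots, i_{m-1}] = r[i_1, \ldots, i_{b-1}, i, i_b, \ldots, i_{m-1}]$.
The following simple property about tensor product codes will be needed in our proof.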
The following simple property about tensor product codes will be needed in our proof. 

Proposition 3.1 For $b \in \{1, \ldots, m\}$ let $C_b$ be an $[n_b, k_b, d_b]_\Sigma$ code, and let $I_b$ be a set of cardinality
at least $n_b - d_b + 1$. Let $C'_b$ be the code obtained by the projection of $C_b$ to $I_b$. Then every codeword
c' of $C' = C'_1 \otimes \cdots \otimes C'_m$ can be extended to a unique codeword c of $C = C_1 \otimes \cdots \otimes C_m$.

Proof: The projection of $C_b$ to $C'_b$ is bijective. It is surjective because it is a projection, and it is
injective because $|I_b| > n_b - d_b$. So, the projection of C to C' is a bijection, because both codes are
of dimension $\prod_{b=1}^{m} k_b$. Thus, every word in C' has a unique preimage in C. ∎

Recall that the m-Product tester picks a random $b \in [m]$ and $i \in [n]$ and verifies that $r_{b,i} \in C^{m-1}$.
Let $\rho(r, (b,i))$ denote the distance of the view of this tester from $C^{m-1}$ when accessing oracle r on
random string (b, i). Note that $\rho(r,(b,i)) = \delta_{C^{m-1}}(r_{b,i})$. Let $\rho(r) = E_{b,i}[\delta_{C^{m-1}}(r_{b,i})]$. We wish to
show for every r that $\rho(r) \ge 2^{-16} \cdot \delta_{C^m}(r)$, or equivalently $\delta_{C^m}(r) \le 2^{16} \cdot \rho(r)$.

We start by first getting a crude upper bound on the proximity of r to $C^m$ and then we use the
crude bound to get a tighter relationship. To get the crude bound, we first partition the random
strings into two classes: those strings (b, i) for which $\rho(r,(b,i))$ is large, and those for which it is
small. More precisely, for $r \in \Sigma^{n^m}$ and a threshold $\tau \in [0,1]$, define the $\tau$-soundness-error of r
to be the probability that $\delta_{C^{m-1}}(r_{b,i}) > \tau$, when $b \in [m]$ and $i \in [n]$ are chosen uniformly and
independently. Note that, by Markov's inequality, the $\sqrt{\rho}$-soundness-error of r is at most $\sqrt{\rho}$ for
$\rho = \rho(r)$. We start by showing that r is $O(\tau + \epsilon)$-close (and thus also $O(\sqrt{\rho})$-close) to some
codeword of $C^m$.

Lemma 3.2 If the $\tau$-soundness-error of r is $\epsilon$ for $\tau + 2\epsilon \le \frac{1}{12} \cdot \left(\frac{d-1}{n}\right)^m$, then
$\delta_{C^m}(r) \le 16 \cdot \left(\frac{n}{d}\right)^{m-1} \cdot (\tau + \epsilon)$.
Proof: For every $i \in [n]$ and $b \in [m]$, fix $c_{b,i}$ to be a closest codeword from $C^{m-1}$ to $r_{b,i}$. We
follow the proof outline of Raz & Safra [24] which when adapted to our context goes as follows: (1)
Given a vector r and an assignment of codewords $c_{b,i} \in C^{m-1}$, we define an "inconsistency" graph
G. (Note that this graph is not the same as the graph $G_n^m$ that defines the test being analysed.
In particular G is related to the word r being tested.) (2) We show that the existence of a large
independent set in this graph G implies the proximity of r to a codeword of $C^m$ (i.e., $\delta_{C^m}(r)$ is
small). (3) We show that this inconsistency graph is sparse if the $\tau$-soundness-error is small. (4)
We show that the distance of C forces the graph to be special in that every edge is incident to at
least one vertex whose degree is large.

Definition of G. The vertices of G are indexed by pairs (b, i) with $b \in [m]$ and $i \in [n]$. Vertex
$(b_1, i_1)$ is adjacent to $(b_2, i_2)$ if at least one of the following conditions holds:

1. $\delta_{C^{m-1}}(r_{b_1,i_1}) > \tau$.

2. $\delta_{C^{m-1}}(r_{b_2,i_2}) > \tau$.

3. $b_1 \ne b_2$ and $c_{b_1,i_1}$ and $c_{b_2,i_2}$ are inconsistent, i.e., there exists some element $j = \langle j_1, \ldots, j_m \rangle \in
[n]^m$, with $j_{b_1} = i_1$ and $j_{b_2} = i_2$, such that $c_{b_1,i_1}[j^{(b_1)}] \ne c_{b_2,i_2}[j^{(b_2)}]$, where $j^{(c)} \in [n]^{m-1}$ is the
vector j with its c-th coordinate deleted.

Independent sets of G and proximity of r. It is clear that G has mn vertices. We claim next
that if G has an independent set I of size at least $m(n-d) + d + 1$ then r has distance at most
$1 - (|I|/(mn))(1 - \tau)$ to $C^m$.

Consider an independent set $I = I_1 \cup \cdots \cup I_m$ in G with $I_b$ of size $n_b$ being the set of vertices of
the form (b, i), $i \in [n]$. W.l.o.g. assume $n_1 \ge \cdots \ge n_m$. Then, we have $n_1, n_2 > n - d$ (or else even
if $n_1 = n$ and $n_2 = n - d$ we'd only have $\sum_b n_b \le n + (m-1)(n-d)$). We consider the partial
vector $r' \in \Sigma^{I_1 \times n \times \cdots \times n}$ defined as $r'[i, j_2, \ldots, j_m] = c_{1,i}[j_2, \ldots, j_m]$ for $i \in I_1$ and $j_2, \ldots, j_m \in [n]$.
We show that r' can be extended into a codeword of $C^m$ and that the extended word is close to r,
and this will give the claim.

First, we show that any extension of r' is close to r: This is straightforward since on each coordinate
$i \in I_1$, we have r agrees with r' on a $1 - \tau$ fraction of the points. Furthermore $|I_1|/n$ is at least $|I|/(mn)$
(since $n_1$ is the largest). So we have that r' is at most $1 - (|I|/(mn))(1 - \tau)$ far from r.

Now we prove that r' can be extended into a codeword of $C^m$. Let $C_b = C|_{I_b}$ be the projection
(puncturing) of C to the coordinates in $I_b$. Let r'' be the projection of r' to the coordinates in
$I_1 \times I_2 \times [n] \times \cdots \times [n]$. We will argue below that r'' is a codeword of $C_1 \otimes C_2 \otimes C^{m-2}$, by considering its
projection to axis-parallel lines and claiming all such projections yield codewords of the appropriate
code. Note first that the restriction of r' to any line parallel to the b-th axis is a codeword of C, for
every $b \in \{2, \ldots, m\}$, since $c_{1,i}$ is a codeword of $C^{m-1}$ for every $i \in I_1$. Thus this continues to hold
for r'' (except that now the projection to a line parallel to the 2nd coordinate axis is a codeword
of $C_2$). Finally, consider a line parallel to the first axis, given by restricting the other coordinates
to $\langle i_2, \ldots, i_m \rangle$, with $i_2 \in I_2$. We claim that for every $i_1 \in I_1$, $r''[i_1, \ldots, i_m] = c_{2,i_2}[i_1, i_3, \ldots, i_m]$. This
follows from the fact that the vertices $(1, i_1)$ and $(2, i_2)$ are not adjacent to each other, which
implies that $c_{1,i_1}$ and $c_{2,i_2}$ are consistent with each other. We conclude that the restriction of r''
to every axis-parallel line is a codeword of the appropriate code, and thus (by Proposition 2.5), r''
is a codeword of $C_1 \otimes C_2 \otimes C^{m-2}$. Now applying Proposition 3.1 to the code $C_1 \otimes C^{m-1}$ and its
projection $C_1 \otimes C_2 \otimes C^{m-2}$ we get that there exists a unique extension of r'' into a codeword c' of
the former. We claim this extension is exactly r' since for every $i \in I_1$, $c'[i, j, k] = r'[i, j, k]$. Finally
applying Proposition 3.1 one more time, this time to the code $C^m$ and its projection $C_1 \otimes C^{m-1}$,
we find that r' = c' can be extended into a codeword of the former. This concludes the proof of
this claim.

Density of G. We now see that the small $\tau$-soundness-error of the test translates into a small
density $\gamma$ of edges in G. Below, we refer to pairs (b, i) with $b \in [m]$ and $i \in [n]$ as "planes" (since
they refer to (m − 1)-dimensional planes in $[n]^m$) and refer to elements of $[n]^m$ as "points". We
say a point $p = \langle p_1, \ldots, p_m \rangle$ lies on a plane (b, i) if $p_b = i$. We call a plane (b, i) $\tau$-robust if
$\delta_{C^{m-1}}(r_{b,i}) \le \tau$. Now consider the following test: Pick
two random planes $(b_1, i_1)$ and $(b_2, i_2)$ subject to the constraint $b_1 \ne b_2$ and pick a random point p
in the intersection of the two planes and verify that $c_{b_1,i_1}$ is consistent with r[p]. Let $\kappa$ denote the
rejection probability of this test. We bound $\kappa$ from both sides.

On the one hand we have that the rejection probability is at least the probability that we pick
two planes that are $\tau$-robust and incident to each other in G (which is at least $\gamma \frac{m}{m-1} - 2\epsilon$) times the
probability that we pick a point on the intersection at which the two plane codewords disagree (at
least $(d/n)^{m-2}$), times the probability that the codeword that disagrees with the point function is
the first one (which is at least 1/2). Thus we get $\kappa \ge \frac{1}{2} \left(\frac{d}{n}\right)^{m-2} \left(\gamma \frac{m}{m-1} - 2\epsilon\right)$.

On the other hand we have that in order to reject it must be the case that either $\delta_{C^{m-1}}(r_{b_1,i_1}) > \tau$
(which happens with probability at most $\epsilon$) or $\delta_{C^{m-1}}(r_{b_1,i_1}) \le \tau$ and p is such that $r_{b_1,i_1}$ and $c_{b_1,i_1}$
disagree at p (which happens with probability at most $\tau$). Thus we have $\kappa \le \tau + \epsilon$. Putting the
two together we have $\gamma \le \frac{m-1}{m} \left(2\epsilon + 2 \left(\frac{n}{d}\right)^{m-2} (\tau + \epsilon)\right)$.

Structure of G. Next we note that every edge of G is incident to at least one high-degree vertex.
Consider a pair of planes that are adjacent to each other in G. If either of the vertices is not
$\tau$-robust, then it is adjacent to every vertex of G. So assume both are $\tau$-robust.

W.l.o.g., let these be the vertices (1, i) and (2, j). Thus the codewords $c_{1,i}$ and $c_{2,j}$ disagree on the
(m − 2)-dimensional surface with the first two coordinates restricted to i and j respectively. Now
let $S = \{\langle k_3, \ldots, k_m \rangle \mid c_{1,i}[j, k_3, \ldots, k_m] \ne c_{2,j}[i, k_3, \ldots, k_m]\}$ be the set of disagreeing tuples on this
surface. By the distance of $C^{m-2}$ we know $|S| \ge d^{m-2}$. But now if we consider the vertex $(b, k_b)$ in
G for $b \in \{3, \ldots, m\}$ and $k_b$ such that there exist values of the remaining coordinates satisfying
$k = \langle k_3, \ldots, k_m \rangle \in S$, it must be adjacent to at least one of (1, i) or (2, j) (it can't agree with both
at the point $\langle i, j, k \rangle$). Furthermore, there exist d such $k_b$'s for every $b \in \{3, \ldots, m\}$. Thus the sum of
the degrees of (1, i) and (2, j) is at least (m − 2)d, and so at least one has degree at least (m − 2)d/2.

Putting it together. From the last paragraph above, we have that the set of vertices of degree
less than (m − 2)d/2 form an independent set in the graph G. The fraction of vertices of degree at
least (m − 2)d/2 is at most $2(\gamma mn)/((m-2)d)$. Thus we get that if $mn \cdot (1 - 2(\gamma mn)/((m-2)d)) \ge
m(n-d) + d + 1$, then r is $\delta$-proximate to $C^m$ for $\delta \le \tau + (1 - \tau) \cdot 2(\gamma mn)/((m-2)d)$. The lemma now
follows by simplifying the expressions above, using the upper bound on $\gamma$ derived earlier. Details
below.
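We first focus on the condition $|I| \ge m(n-d) + d + 1$. It suffices to prove that

\[
\begin{aligned}
& mn \cdot \left(1 - \frac{2\gamma mn}{(m-2)d}\right) \ge m(n-d) + d + 1 \\
\Longleftarrow\ & (m-1)d - 1 \ge \frac{2\gamma m^2 n^2}{(m-2)d} \\
\Longleftarrow\ & (m-1)(d-1) \ge \frac{2\gamma m^2 n^2}{(m-2)d} \\
\Longleftarrow\ & (m-1)(d-1) \ge \frac{2 m^2 n^2}{(m-2)d} \cdot \frac{m-1}{m} \cdot \left(2\epsilon + 2\left(\frac{n}{d}\right)^{m-2}(\tau + \epsilon)\right) \\
\Longleftarrow\ & (d-1) \ge \frac{2 m n^2}{(m-2)d} \cdot 2\left(\frac{n}{d}\right)^{m-2}(\tau + 2\epsilon) \\
\Longleftarrow\ & \tau + 2\epsilon \le \frac{m-2}{4m} \cdot \frac{d-1}{n} \cdot \left(\frac{d}{n}\right)^{m-1} \\
\Longleftarrow\ & \tau + 2\epsilon \le \frac{1}{12} \cdot \left(\frac{d-1}{n}\right)^{m}.
\end{aligned}
\]

The last step uses $m \ge 3$, so that $\frac{m-2}{4m} \ge \frac{1}{12}$, together with $d/n \ge (d-1)/n$.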

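The above shows that the condition assumed in the lemma statement indeed is sufficient to establish
a large independent set. Next we simplify the proximity bound obtained. We have

\[
\begin{aligned}
\delta &\le \tau + (1 - \tau) \cdot \frac{2\gamma mn}{(m-2)d} \\
&\le \tau + \frac{2\gamma mn}{(m-2)d} \\
&\le \tau + \frac{2mn}{(m-2)d} \cdot \frac{m-1}{m} \cdot \left(2\epsilon + 2\left(\frac{n}{d}\right)^{m-2}(\tau + \epsilon)\right) \\
&\le \tau + \frac{2mn}{(m-2)d} \cdot \frac{m-1}{m} \cdot 2\left(\frac{n}{d}\right)^{m-2} \cdot (\tau + 2\epsilon) \\
&= \tau + \frac{4(m-1)}{m-2} \cdot \left(\frac{n}{d}\right)^{m-1} \cdot (\tau + 2\epsilon) \\
&\le \tau + 8 \cdot \left(\frac{n}{d}\right)^{m-1} \cdot (\tau + 2\epsilon) \\
&\le 16 \cdot \left(\frac{n}{d}\right)^{m-1} \cdot (\tau + \epsilon).
\end{aligned}
\]

∎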
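
As a sanity check on the arithmetic in the proof of Lemma 3.2, the following sketch evaluates both simplifications with exact rational numbers on a few sample parameters. It assumes the bound $\gamma \le \frac{m-1}{m}(2\epsilon + 2(n/d)^{m-2}(\tau+\epsilon))$ from the density step and the hypothesis $\tau + 2\epsilon \le \frac{1}{12}((d-1)/n)^m$; the parameter choices and function names are ours.

```python
from fractions import Fraction as F

def gamma_bound(n, d, m, tau, eps):
    """Upper bound on the edge density gamma assumed from the 'Density of G' step."""
    return F(m - 1, m) * (2 * eps + 2 * F(n, d) ** (m - 2) * (tau + eps))

def check(n, d, m, tau, eps):
    """Return (independent-set condition holds, proximity bound is within the lemma's bound)."""
    g = gamma_bound(n, d, m, tau, eps)
    high_degree_fraction = 2 * g * m * n / ((m - 2) * d)
    condition = m * n * (1 - high_degree_fraction) >= m * (n - d) + d + 1
    proximity = tau + (1 - tau) * high_degree_fraction
    target = 16 * F(n, d) ** (m - 1) * (tau + eps)
    return condition, proximity <= target

results = []
for m in (3, 4, 5):
    for n, d in ((10, 9), (20, 15), (50, 49)):
        budget = F(1, 12) * F(d - 1, n) ** m      # hypothesis: tau + 2*eps <= budget
        for tau, eps in ((budget, F(0)), (F(0), budget / 2), (budget / 2, budget / 4)):
            results.append(check(n, d, m, tau, eps))
```

Every sampled parameter setting at the boundary of the hypothesis satisfies both the independent-set condition and the final proximity bound, as the derivation predicts.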

Next we improve the bound achieved on the proximity of r by looking at the structure of the graph
$G_n^m$ (the graph underlying the m-Product tester) and its "expansion". Such improvements are a
part of the standard toolkit in the analysis of low-degree tests based on axis-parallel lines (see e.g.,
[7, 6, 16, 17] etc.) We follow the proof outline of [17] which in turn uses a proof technique of [10].

First, some notation: Fix n and m and the graph $G_n^m$. Let L and R denote the left and right
vertices of $G_n^m$. Let $d_L$ and $d_R$ denote its left and right degrees. And let E denote the edges of $G_n^m$.
Note $|L| = n^m$, $|R| = mn$, $d_L = m$ and $d_R = n^{m-1}$. In particular, $d_L \cdot |L| = d_R \cdot |R|$. For a set
$A \subseteq L \cup R$, let $\Gamma(A) = \{(u,v) \in E \mid u \in A, v \notin A\}$. Using this notation, we have the following:

Lemma 3.3 Fix $n, m \ge 3$ and let L, R denote the two partitions of the vertices of $G_n^m$ and let
$d_L, d_R$ denote the left and right degrees. Let $S \subseteq L$ and $T \subseteq R$ be such that $\frac{|S|}{|L|} \le \frac{1}{4}$. Then
$|\Gamma(S \cup T)| \ge \frac{d_L}{8} \cdot |S| + \frac{d_R}{8} \cdot |T|$.



Proof: We start with a simple observation that also allows us to bound the size of T. Suppose
$|T| \ge |R|/2$. Then the number of edges leaving T is at least $d_R \cdot |T| \ge d_R \cdot (|R|/2)$. On the other
hand the number of edges entering S is at most $d_L \cdot |S| \le d_L \cdot (|L|/4)$. Thus in this case, we have

\[
\begin{aligned}
|\Gamma(S \cup T)| &\ge d_R \cdot (|R|/2) - d_L \cdot (|L|/4) \\
&= d_R \cdot (|R|/4) \\
&= d_R \cdot (|R|/8) + d_L \cdot (|L|/8) \\
&\ge d_L \cdot (|S|/8) + d_R \cdot (|T|/8).
\end{aligned}
\]
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We are thus reduced to the case where |T|/|i?| < ^. Here, we follow the proof of Babai and 

Szegcdy [10]. (See also [21]). The crucial fact needed to apply their proof is that the graph 
is edge-transitive, i.e., for every pair of edges ei,e2 in G^, there is an automorphism of that 
maps ei to 62. This fact is used as follows: Let A denote the set of all automorphisms of G^. Then 
if we consider any fixed edge e G and all its images under automorphisms ^ as a multiset, then 
every edge of appears exactly the same number of times. 

Armed with this fact, the proof proceeds as follows: For every pair $u \in L$ and $v \in R$ define
a canonical shortest path $P_{u,v}$. Note that this path has length at most three. Note that an
automorphism from $\mathcal{A}$ maps a path in $G_n^m$ to a path in $G_n^m$. Now consider the multiset $\mathcal{P}$ of
all paths obtained by taking the paths $P_{u,v}$ for every $u, v$, and their automorphisms for every
automorphism in $\mathcal{A}$. The cardinality of $\mathcal{P}$ is thus $|\mathcal{A}| \cdot |L| \cdot |R|$. The symmetry over the edges
implies that every edge in $E$ has exactly the same number, say $N$, of paths from $\mathcal{P}$ passing through
them. Since each path has at most three edges, we have $N \le \frac{3\,|\mathcal{A}|\cdot|L|\cdot|R|}{|E|} = \frac{3\,|\mathcal{A}|\cdot|R|}{d_L}$, or equivalently

\[
\frac{|\mathcal{A}|}{N} \ge \frac{d_L}{3\cdot|R|} = \frac{d_R}{3\cdot|L|}.
\]

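Now consider the set of paths $\mathcal{P}' \subseteq \mathcal{P}$ whose endpoints involve exactly one element of $S \cup T$. We
have the cardinality of $\mathcal{P}'$ equals $|\mathcal{A}| \cdot (|S| \cdot |\bar{T}| + |\bar{S}| \cdot |T|)$ (where $\bar{S} = L - S$ and $\bar{T} = R - T$). On
the other hand, every path in $\mathcal{P}'$ uses an edge of $\Gamma(S \cup T)$, so $|\mathcal{P}'| \le N \cdot |\Gamma(S \cup T)|$.

Combining the two we have

\[
\begin{aligned}
|\Gamma(S \cup T)| &\ge \frac{1}{N}\cdot|\mathcal{P}'| \\
&= \frac{|\mathcal{A}|}{N}\cdot\left(|S|\cdot|\bar{T}| + |\bar{S}|\cdot|T|\right) \\
&\ge \frac{d_L}{8}\cdot|S| + \frac{d_R}{8}\cdot|T|.
\end{aligned}
\]

The last step uses $\frac{|\mathcal{A}|}{N} \ge \frac{d_L}{3|R|} = \frac{d_R}{3|L|}$ together with $|\bar{T}| \ge |R|/2$ and $|\bar{S}| \ge 3|L|/4$.

This proves the lemma. ∎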
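The inequality of Lemma 3.3 can be spot-checked numerically on a small instance. The sketch below is only an illustration and makes the same assumption about $G_n^m$ as before (a point $u \in [n]^m$ is adjacent to $(b,i)$ iff its $b$-th coordinate is $i$); the helper names `boundary_size` and `check_lemma` are ours. It samples random sets $S$ with $|S| \le |L|/4$ and arbitrary $T$, and compares $8\cdot|\Gamma(S \cup T)|$ with $d_L|S| + d_R|T|$.

```python
import random
from itertools import product

def boundary_size(n, m, S, T):
    """Count edges of the assumed graph with exactly one endpoint in S ∪ T."""
    count = 0
    for u in product(range(n), repeat=m):
        for b in range(m):
            in_S = u in S
            in_T = (b, u[b]) in T
            if in_S != in_T:
                count += 1
    return count

def check_lemma(n, m, trials, seed=0):
    """Return True if no sampled (S, T) violates the bound of Lemma 3.3."""
    rng = random.Random(seed)
    L = list(product(range(n), repeat=m))
    R = [(b, i) for b in range(m) for i in range(n)]
    d_L, d_R = m, n ** (m - 1)
    for _ in range(trials):
        S = set(rng.sample(L, rng.randint(0, len(L) // 4)))  # |S| <= |L|/4
        T = set(rng.sample(R, rng.randint(0, len(R))))
        # multiply through by 8 to stay in integer arithmetic
        if 8 * boundary_size(n, m, S, T) < d_L * len(S) + d_R * len(T):
            return False
    return True

print(check_lemma(3, 3, trials=200))
```

A random search of this kind cannot replace the proof; it merely confirms that the statement, as reconstructed here, is consistent on the smallest admissible parameters $n = m = 3$.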



Lemma 3.4 Let $m$ be a positive integer and $C$ be an $[n, k, d]_\Sigma$ code with the property $(d/n)^{m-1} \ge \frac78$.
If $r \in \Sigma^{n^m}$ and $c \in C^m$ satisfy $\delta(r, c) \le \frac14$ then $\delta(r, c) \le 8\rho(r)$.

Proof: Let $L, R$ denote the two partitions of the vertices of $G_n^m$. Note that the right vertices
of $G_n^m$ are of the form $(b, i)$, with $b \in [m]$ and $i \in [n]$. Let $r_{b,i}$ denote the projection of $r$ to the
neighborhood of the right vertex $(b, i)$, and let $c_{b,i}$ denote the projection of $c$ to the same. Let $c'_{b,i}$
denote the codeword of $C^{m-1}$ closest to $r_{b,i}$. Call an edge $(u, (b, i))$ of $G_n^m$ bad if $r$ and $c'_{b,i}$ disagree
at $u$. Note that the fraction of bad edges equals $\rho(r)$.

We now lower bound $\rho(r)$ in terms of $\delta(r, c)$. For this part we use Lemma 3.3. Let $S \subseteq L$ be
the set of vertices $(i_1, \ldots, i_m)$ for which $r[i_1, \ldots, i_m] \ne c[i_1, \ldots, i_m]$. Note that by assumption
$|S|/|L| = \delta(r, c) \le \frac14$. Let $T \subseteq R$ be the set of vertices $(b, i)$ for which $c_{b,i} \ne c'_{b,i}$. By Lemma 3.3 we
have $|\Gamma(S \cup T)| \ge \frac{d_L}{8}\cdot|S| + \frac{d_R}{8}\cdot|T|$. We now claim that most of these edges are bad.

Consider first an edge $(u, (b, i))$ in $G_n^m$ from $S$ to $\bar{T}$. On the one hand $c'_{b,i} = c_{b,i}$ and on the other
$r[u] \ne c[u]$. This leads to a disagreement between $r$ and $c'_{b,i}$ at $u$ and so such an edge is bad. Next,
consider an edge $(u, (b, i))$ from $u \in \bar{S}$ to $T$. We do have $r[u] = c[u]$ and $c'_{b,i} \ne c_{b,i}$, but this
doesn't imply that $(u, (b, i))$ is bad, since $c_{b,i}$ and $c'_{b,i}$ need not disagree at $u$. Indeed for every
$(b, i) \in T$, there may be up to $n^{m-1} - d^{m-1}$ edges $(u, (b, i))$ for which $c'_{b,i}$ and $r$ agree at $u$, but the
remaining edges out of $(b, i)$ are bad. Discounting for these edges, we see that all but at most
$(n^{m-1} - d^{m-1}) \cdot |T|$ edges from $T$ to $\bar{S}$ are bad. Thus we get that the number of bad edges is
at least $\frac{d_L}{8}\cdot|S| + \frac{d_R}{8}\cdot|T| - (n^{m-1} - d^{m-1})\cdot|T|$. Using $d_R = n^{m-1}$ and $d^{m-1}/n^{m-1} \ge \frac78$, we
get $\frac{d_R}{8}\cdot|T| - (n^{m-1} - d^{m-1})\cdot|T| \ge 0$. Thus we get that the fraction of bad edges $\rho$ is at least
$\frac18\cdot(|S|/|L|) = \delta(r, c)/8$. We conclude $\delta(r, c) \le 8\cdot\rho(r)$. ∎
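As an illustration of Lemma 3.4, consider the simplest code meeting its hypothesis. The sketch below assumes $C$ is the binary repetition code of length $n$ (an $[n, 1, n]$ code, so $(d/n)^{m-1} = 1$), whose tensor power $C^m$ consists of the two constant words on $[n]^m$; this choice of code, the comparison against the all-zero codeword, and the helper names are ours. It computes $\rho(r)$ as the average, over hyperplanes $(b,i)$, of the relative distance of $r_{b,i}$ from the nearest constant word, and checks $\delta(r, c) \le 8\rho(r)$ for random $r$ with $\delta(r, c) \le \frac14$.

```python
import random
from itertools import product

def rho_and_delta(r, n, m):
    """r maps points of [n]^m to bits; returns (rho(r), delta(r, 0))
    for the tensor power of the length-n binary repetition code."""
    points = list(product(range(n), repeat=m))
    plane = n ** (m - 1)                      # size of each hyperplane
    total = 0
    for b in range(m):
        for i in range(n):
            ones = sum(r[u] for u in points if u[b] == i)
            # distance of the restriction to the nearest constant word
            total += min(ones, plane - ones)
    rho = total / (m * n * plane)
    delta = sum(r.values()) / len(points)     # distance to the all-zero word
    return rho, delta

def check(n=3, m=3, trials=200, seed=1):
    rng = random.Random(seed)
    points = list(product(range(n), repeat=m))
    for _ in range(trials):
        flipped = set(rng.sample(points, rng.randint(0, len(points) // 4)))
        r = {u: int(u in flipped) for u in points}
        rho, delta = rho_and_delta(r, n, m)
        if delta > 8 * rho + 1e-12:           # small slack for float rounding
            return False
    return True

print(check())
```

For this toy code the bound is far from tight, which is consistent with the lemma being a worst-case statement over all codes satisfying the distance condition.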

We are now ready to put the pieces together to prove Theorem 2.6. 

Proof of Theorem 2.6: Let a = 2'^"^ ■ {^) We wiU prove that the m-Product Tester is 
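$\alpha$-robust for $C^m$. Note that $\alpha \ge 2^{-16}$, as required for the theorem, and that $\sqrt{\alpha}$ is small enough both
for $3\sqrt{\alpha}$ to meet the condition of Lemma 3.2 and for $\sqrt{\alpha} \le \frac{1}{128}\cdot(d/n)^{m-1}$ to hold (as will be required below).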

The completeness (that codewords of $C^m$ have expected robustness zero) follows from Proposition 2.5.
For the soundness, consider any vector $r \in \Sigma^{n^m}$ and let $\rho = \rho(r)$. If $\rho \ge \alpha$, then
there is nothing to prove since $\rho/\alpha \ge 1 \ge \delta_{C^m}(r)$. So assume $\rho < \alpha$.

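Note that $r$ has $\sqrt{\rho}$-soundness-error at most $\sqrt{\rho}$. Furthermore, by the assumption on $\rho$, we have
$3\sqrt{\rho} \le 3\sqrt{\alpha}$, which is small enough for the condition of Lemma 3.2 to hold, and so, by Lemma 3.2, we have
$\delta_{C^m}(r) \le 16\cdot(n/d)^{m-1}\cdot 2\cdot\sqrt{\rho}$. Now using $\sqrt{\rho} \le \sqrt{\alpha} \le \frac{1}{128}\cdot(d/n)^{m-1}$, we get $\delta_{C^m}(r) \le \frac14$.
Let $v$ be a codeword of $C^m$ closest to $r$. We now have $\delta(r, v) \le \frac14$ and $(d/n)^{m-1} \ge \frac78$, and so, by Lemma 3.4,
we get $\delta_{C^m}(r) = \delta(r, v) \le 8\rho$. This concludes the proof. ∎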

4 Tanner Product Codes and Composition 

In this section we define the composition of two Tanner Product Codes, and show how they preserve
robustness. We then use this composition to show how to test $C^m$ using projections to $C^2$.

4.1 Composition 

Recall that a Tanner Product Code is given by a pair $(G, C_{\mathrm{small}})$. We start by defining a composition
of graphs that corresponds to the composition of codes.

Given an $(N, M, D)$-ordered graph $G = \{\ell_1, \ldots, \ell_M\}$ and an additional $(D, m, d)$-ordered graph
$G' = \{\ell'_1, \ldots, \ell'_m\}$, their Tanner Composition, denoted $G ⓒ G'$, is an $(N, M \cdot m, d)$-ordered graph
with adjacency lists $\{\ell''_{(j,j')} \mid j \in [M], j' \in [m]\}$, where $\ell''_{(j,j'),i} = \ell_{j,\,\ell'_{j',i}}$.
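The index bookkeeping in this definition is easy to get wrong, so here is a minimal sketch of it. It assumes an ordered graph is represented as a list of adjacency lists (one list per right vertex) with 0-based indices; the function name `tanner_compose` and the toy graphs are ours.

```python
def tanner_compose(G, G_prime):
    """G: list of M adjacency lists, each of length D, with entries in range(N).
    G_prime: list of m adjacency lists, each of length d, with entries in range(D).
    Returns the M*m adjacency lists of the composition, each of length d."""
    composed = []
    for ell in G:                 # test j of the outer graph
        for ell_p in G_prime:     # test j' of the inner graph
            # the i-th neighbour of (j, j') is the (ell_p[i])-th neighbour of j
            composed.append([ell[idx] for idx in ell_p])
    return composed

G = [[0, 1, 2, 3], [2, 3, 4, 5]]      # a (6, 2, 4)-ordered graph
G_prime = [[0, 1], [2, 3], [0, 3]]    # a (4, 3, 2)-ordered graph
print(tanner_compose(G, G_prime))
# [[0, 1], [2, 3], [0, 3], [2, 3], [4, 5], [2, 5]]
```

The result has $M \cdot m = 6$ adjacency lists of length $d = 2$ over the original $N = 6$ left vertices, matching the $(N, M \cdot m, d)$ shape in the definition.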

Lemma 4.1 (Composition) Let $G_1$ be an $(N, M, D)$-ordered graph, and $C_1 \subseteq \Sigma^D$ be a linear
code with $C = \mathrm{TPC}(G_1, C_1)$. Further, let $G_2$ be a $(D, m, d)$-ordered graph and $C_2 \subseteq \Sigma^d$ be a linear
code such that $C_1 = \mathrm{TPC}(G_2, C_2)$. Then $C = \mathrm{TPC}(G_1 ⓒ G_2, C_2)$ (giving a $d$-query local test for $C$).
Furthermore if $(G_1, C_1)$ is $c_1$-robust and $(G_2, C_2)$ is $c_2$-robust, then $(G_1 ⓒ G_2, C_2)$ is $c_1 \cdot c_2$-robust.

Proof: We focus on the robustness of $C$, as all other claims follow immediately from definitions.
Assume $w \in \Sigma^N$ has distance $\delta$ from $C$. Then, since $C = \mathrm{TPC}(G_1, C_1)$ is $c_1$-robust, the
expected distance of a random "medium"-size test (of query size $D$) is at least $\delta c_1$, so by the
$c_2$-robustness of $C_1 = \mathrm{TPC}(G_2, C_2)$ the expected distance of the "small"-size test (of query complexity
$d$) is at least $\delta c_1 \cdot c_2$, as claimed. ∎



4.2 Testing a 4-Wise Tensor Product Code

We continue by recasting the results of Section 3 in terms of robustness of associated Tanner
Products. Recall that $G_n^m$ denotes the graph that corresponds to the tests of $C^m$ by the m-Product
Tester, where $C \subseteq \Sigma^n$.

Note that $G_n^m$ can be composed with $G_n^{m-1}$ and so on. For $m' < m$, define $G^m_{n,m'} = G^m_n$ if $m' = m-1$
and define $G^m_{n,m'} = G^m_n ⓒ G^{m-1}_{n,m'}$ otherwise. Thus we have that $C^m = \mathrm{TPC}(G^m_{n,m'}, C^{m'})$. The
following lemma (which follows easily from Theorem 2.6 and Lemma 4.1) gives the robustness of
$(G^4_{n,2}, C^2)$.

Lemma 4.2 Let $C$ be an $[n, k, d]_\Sigma$ code with $((d-1)/n)^4 \ge \frac78$. Then $(G^4_{n,2}, C^2)$ is $2^{-32}$-robust.

Proof: Since we have $((d-1)/n)^4 \ge \frac78$, we may apply Theorem 2.6 with $m = 3, 4$ to get that
$(G^3_n, C^2)$ and $(G^4_n, C^3)$ are both $2^{-16}$-robust. Since $C^3 = \mathrm{TPC}(G^3_n, C^2)$, we may apply Lemma 4.1
to conclude that $(G^4_{n,2} = G^4_n ⓒ G^3_n, C^2)$ is $2^{-32}$-robust. ∎
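To make the composed graph $G^4_n ⓒ G^3_n$ concrete, the following sketch builds it for $n = 2$. It rests on assumptions of ours: points of $[n]^m$ are indexed in lexicographic order, the adjacency list of the test $(b,i)$ lists the points with $b$-th coordinate $i$ in that same order, and the function names are hypothetical. Under these conventions each composed test should query an axis-parallel two-dimensional plane of $[n]^4$.

```python
from itertools import product

def ordered_product_graph(n, m):
    """Adjacency lists of the assumed graph: the list for (b, i) holds the
    indices of all points u in [n]^m with u[b] == i, in lexicographic order."""
    points = list(product(range(n), repeat=m))
    index = {u: k for k, u in enumerate(points)}
    return [[index[u] for u in points if u[b] == i]
            for b in range(m) for i in range(n)]

def tanner_compose(G, G_prime):
    """Compose ordered graphs: re-index each outer list by each inner list."""
    return [[ell[idx] for idx in ell_p] for ell in G for ell_p in G_prime]

n = 2
G4 = ordered_product_graph(n, 4)   # (n^4, 4n, n^3)-ordered
G3 = ordered_product_graph(n, 3)   # (n^3, 3n, n^2)-ordered
G42 = tanner_compose(G4, G3)       # (n^4, 12n^2, n^2)-ordered
print(len(G42), len(G42[0]))       # 48 4
```

Removing a fixed coordinate preserves lexicographic order, which is why the indices of $G^3_n$ can be used directly to address positions inside a hyperplane of $[n]^4$.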

4.3 Testing Tensor Products with $C^2$ tests

Finally we define graphs $H^n_t$ so that $C^{2^t} = \mathrm{TPC}(H^n_t, C^2)$. This is easily done recursively by letting
$H^n_2 = G^4_{n,2}$ and letting $H^n_t = G^4_{n^{2^{t-2}},2} ⓒ H^n_{t-1}$ for $t > 2$. We now analyze the robustness of $(H^n_t, C^2)$.

Lemma 4.3 There exists a constant $\alpha > 0$ such that the following holds: Let $t$ be an integer and
C be an [n, k, djs code such that d — 1 > (1 — • n, for m = 2*. Then (-ff", G^) is -robust. 

Proof: Note that the condition in the lemma implies $((d-1)/n)^m \ge \frac78$.
This is the form in which we use the condition.

We prove the lemma, for $\alpha = 2^{-32}$, by induction. For the base case, we have that $(H^n_2 = G^4_{n,2}, C^2)$ is
$2^{-32}$-robust, by Lemma 4.2. (Here we use the fact that $((d-1)/n)^4 \ge \frac78$, as needed.)

For the induction, let $m = 2^t$, and let $C' = C^{m/4}$. Let $G_1 = G^4_{n^{m/4},2}$, $C_1 = (C')^2$, $G_2 = H^n_{t-1}$ and
$C_2 = C^2$. Note that $H^n_t = G_1 ⓒ G_2$ and $C_1 = \mathrm{TPC}(G_2, C_2)$. Thus we can bound the robustness of
$(H^n_t, C_2)$ by bounding the robustness of $(G_1, C_1)$ and $(G_2, C_2)$ and then using Lemma 4.1. Note
that $C_1 = (C')^2$ and $C'$ is an $[n^{m/4}, k^{m/4}, d^{m/4}]_\Sigma$ code, where

\[
\left(\frac{d^{m/4} - 1}{n^{m/4}}\right)^{4} \ge \left(\frac{d-1}{n}\right)^{m} \ge \frac78.
\]

Thus we can apply Lemma 4.2 to conclude that $(G_1, C_1) = (G^4_{n^{m/4},2}, (C')^2)$ is $\alpha$-robust for $\alpha = 2^{-32}$. By
induction, we also have that $(G_2, C_2) = (H^n_{t-1}, C^2)$ is $\alpha^{t-1}$-robust. By Lemma 4.1, $(G_1 ⓒ G_2, C_2)$ is
$\alpha^t$-robust. ∎

We are ready to prove Theorem 2.7. 

Proof of Theorem 2.7: Let $\alpha$ be the constant given by Lemma 4.3. Fix $i$ and let $C = C_i$, $n = n_i$,
etc. (i.e., we suppress the subscript $i$ below). Then $C^m$ is an $[N, K, D]_\Sigma$ code, for $N = n^m$, $K = k^m$
and $D = d^m$. Since $d/n \ge 1 - \frac{1}{2m}$, we have that $C^m$ has relative distance $d^m/n^m \ge \frac12$. Furthermore,
the rate of the code is inverse polynomial, i.e., $N = n^m = (p(k))^m \le \mathrm{poly}(k^m) = \mathrm{poly}(K)$. Finally,
we have $C^m = \mathrm{TPC}(H^n_{\log_2 m}, C^2)$, where $(H^n_{\log_2 m}, C^2)$ is an $\alpha^{\log_2 m}$-robust tester for $C^m$ and this
tester has query complexity $O(n^2)$. From Proposition 2.3 we get that there is a tester for $C^m$ that
makes $O(n^2/\alpha^{\log_2 m}) = \mathrm{poly}\log N$ queries. ∎
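The arithmetic behind the final query bound can be spelled out as follows. This is only a sketch under two assumptions that are not restated in the proof above: that $\alpha = 2^{-32}$ (the value used in the proof of Lemma 4.3), and that $n$ is itself polylogarithmic in $N$, which the $\mathrm{poly}\log N$ claim requires.

```latex
\[
\alpha^{-\log_2 m} = 2^{32\log_2 m} = m^{32},
\qquad
\log_2 N = m\log_2 n \ge m \quad (n \ge 2),
\]
\[
\text{so}\quad
O\!\left(\frac{n^2}{\alpha^{\log_2 m}}\right)
= O\!\left(n^2\, m^{32}\right)
\le O\!\left(n^2\,(\log_2 N)^{32}\right).
\]
```

The factor $m^{32}$ is thus polylogarithmic in $N$ for any choice of $m$, and the remaining factor $n^2$ is controlled by the assumption on $n$.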